Individual Presentation: Object Recognition Model
Deep Learning (CNNs & Transfer Learning)
Artificial Intelligence (AI) & Machine Learning
AI and machine learning models have contributed significantly to
the advancement of many fields. AI has been a valuable addition to
object recognition, and practitioners have started utilising such
models to advance their fields accordingly.
Object detection models are already being used in healthcare
to advance the field and allow doctors to identify diseases
and anomalies in patients' scans sooner (Elhanashi et al., 2025).
There are many real industrial use cases of deep learning
applied to object detection and classification (Wang et
al., 2021).
Before AI | With AI
Manual inspections of datasets | Automated inspections
Time consuming | Quicker results
Expensive | Reduced possibility of human error
Prone to human error | Requires knowledge and understanding of the field
Table 1. Key characteristics of visual inspections before and after AI and deep learning
The object recognition model:
Deep learning (CNNs & Transfer Learning).
Convolutional Neural Networks, Data augmentation and fine-tuning pre-trained models.
Data preparation & validation set
Data preparation:
CIFAR-10 dataset (Krizhevsky, Nair and Hinton, 2009).
60,000 images of 32x32 pixels.
10 classes (airplane, automobile, bird, cat, deer, dog,
frog, horse, ship, truck) 6,000 images each.
Split: 45,000 training + 5,000 validation + 10,000 testing set.
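The split above can be sketched with plain numpy; the zero arrays are placeholders for the real images and labels, which in practice would come from `tensorflow.keras.datasets.cifar10.load_data()` (an assumption about the tooling used):

```python
import numpy as np

# CIFAR-10 ships as 50,000 training and 10,000 test images; the last
# 5,000 training images are held out here as the validation set.
x_train_full = np.zeros((50_000, 32, 32, 3), dtype=np.uint8)  # stand-in for images
y_train_full = np.zeros((50_000,), dtype=np.int64)            # stand-in for labels

x_train, x_val = x_train_full[:45_000], x_train_full[45_000:]
y_train, y_val = y_train_full[:45_000], y_train_full[45_000:]
```

Slicing from the end of the shuffled training array is one simple way to get the 45,000/5,000 split; a stratified split (500 per class, as on this slide) would require grouping by label first.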
Data augmentation:
Provides variability to the training dataset
Flipping.
Rotation.
Shifting.
Improves generalization and prevents overfitting
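A minimal numpy sketch of the three transforms above; a real pipeline would more likely use a library utility such as Keras' `ImageDataGenerator`, and the ranges chosen here (90-degree rotation steps, 3-pixel shifts) are illustrative assumptions:

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Randomly flip, rotate and shift one square HxWxC image."""
    out = img
    if rng.random() < 0.5:
        out = np.fliplr(out)                   # random horizontal flip
    out = np.rot90(out, k=rng.integers(0, 4))  # rotation in 90-degree steps
    dy, dx = rng.integers(-3, 4, size=2)       # shift by up to 3 pixels
    out = np.roll(out, shift=(int(dy), int(dx)), axis=(0, 1))
    return out
```

Each transform only rearranges pixels, so the augmented image keeps the original shape and content while presenting the network with a new variation every epoch.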
Validation set:
5,000 validation set (500 of each class)
A validation set is crucial when training such models
(Xu and Goodacre, 2018)
Allows overfitting to be monitored between epochs and can
guide adjustments such as learning-rate scheduling and
early stopping
Data preparation purpose:
Ensures the model runs effectively on the dataset and
architectures chosen for this project
Model Selection & Architecture & Hyperparameters
Model Selection:
Deep Learning (CNNs & Transfer Learning)
Focus areas: Convolutional Neural Networks, data augmentation and fine-tuning pre-trained models
Custom CNN architecture:
Effective on small images and datasets (Alam et al., 2024)
Automatically learn spatial hierarchies of features (Geirhos et al., 2018)
Flexibility to build an architecture specific to the dataset requirements
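As an illustration, a small CNN of this kind could be sketched in Keras as below; the layer sizes are assumptions for illustration, not the project's exact architecture:

```python
import tensorflow as tf

# Hypothetical compact CNN for 32x32 CIFAR-10 images.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),                     # regularization
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 CIFAR-10 classes
])
```

The stacked Conv2D/MaxPooling2D pairs are what learn the spatial feature hierarchies mentioned above, from edges in early layers to object parts in later ones.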
MobileNetV2 transfer learning:
Pretrained on ImageNet
Compatible with CIFAR-10 dataset with minimal retraining
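A sketch of this setup, assuming the Keras implementation of MobileNetV2; `weights=None` is used here only so the sketch builds offline, whereas the project would load the pretrained ImageNet weights (`weights="imagenet"`), and the 96x96 resize is an illustrative assumption:

```python
import tensorflow as tf

# Pretrained base without its ImageNet classification head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights=None)
base.trainable = False  # freeze the base for the initial training phase

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Resizing(96, 96),  # upsample CIFAR-10 images for the base
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),  # new CIFAR-10 head
])
```

Only the new classification head is trained at first, which is what makes the "minimal retraining" above possible.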
Hyperparameters:
Batch size, number of epochs and learning rate were chosen as commonly
used starting values rather than exhaustively tuned
Performance Metrics & Training Strategy
Performance Metrics:
Accuracy of the models: 79.84% baseline CNN & 71.07% MobileNetV2
Precision, Recall, F1-score (results shown on the next slide) to identify the
classes on which the models performed best
Accuracy, Loss, Validation Accuracy, Validation Loss (code next slide) to check
the models for overfitting
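The per-class metrics could be produced with scikit-learn's `classification_report` roughly as follows; the tiny synthetic label arrays stand in for the real test-set predictions:

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report

classes = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]

# Placeholder labels; in the project these would be the test-set
# ground truth and the model's argmax predictions.
y_true = np.array([0, 1, 2, 3, 3, 4])
y_pred = np.array([0, 1, 2, 3, 4, 4])

accuracy = accuracy_score(y_true, y_pred)
report = classification_report(
    y_true, y_pred, labels=list(range(10)),
    target_names=classes, zero_division=0)
print(f"accuracy: {accuracy:.2%}")
print(report)
```

The report lists precision, recall and F1-score per class, which is exactly what the slide uses to compare how the two models handle each CIFAR-10 category.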
Training Strategy:
CNN:
25 epochs
Batch-size 64
Adam optimizer for adaptive learning rates (code results of training next slide)
Data augmentation to prevent overfitting, help the model generalize better
MobileNetV2:
Base initially frozen for 15 epochs
Then, fine-tuning of the top layers at a learning rate
of 0.00001 for 25 epochs
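The two-phase schedule above might look roughly like this in Keras; the 0.00001 fine-tuning rate comes from the slides, while the initial rate, the loss function and the number of unfrozen layers are assumptions:

```python
import tensorflow as tf

def compile_for_phase(model: tf.keras.Model, base: tf.keras.Model,
                      fine_tune: bool) -> None:
    """Phase 1: frozen base. Phase 2: unfreeze the top of the base
    and recompile at a much lower learning rate."""
    if fine_tune:
        base.trainable = True
        for layer in base.layers[:-20]:  # keep all but the top layers frozen
            layer.trainable = False
        lr = 1e-5                        # 0.00001, per the training strategy
    else:
        base.trainable = False
        lr = 1e-3                        # assumed starting rate
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# Usage sketch: compile_for_phase(model, base, fine_tune=False), then
# model.fit(..., epochs=15); compile_for_phase(model, base, fine_tune=True),
# then model.fit(..., epochs=25).
```

Recompiling after unfreezing is required in Keras, and the much lower rate keeps the pretrained weights from being destroyed early in fine-tuning.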
Comparative Discussion & Conclusion & Lessons Learned
Baseline CNN | Transfer Learning (MobileNetV2)
Accuracy 79.84% | Accuracy 71.07%
Longer training | Faster convergence
Needs regularization for overfitting | Computational efficiency
Can be trained specifically for small, low-resolution datasets | Adapts better to larger image inputs
Flexibility | Generalization
Table 2. Comparing pros and cons of baseline CNN and MobileNetV2

Key points & Lessons learned:
Deep learning models can be very effective in object
recognition tasks
Choosing architectures suited to the dataset is crucial
for the success of the project
The validation set plays a key role in monitoring
generalization
Hyperparameters play a crucial role in the model's
learning curve
No single model outperforms every other in all cases
(Abou Baker and Handmann, 2024); a successful model
means choosing the right architecture and
hyperparameters for the dataset being used
Suggestion for future work:
Repeating the project with a custom CNN and transfer-learning approach on a larger dataset.
This could give more insight into the strengths and weaknesses of each approach.
Thank you for listening to my presentation
References
Abou Baker, N. and Handmann, U., 2024. One size does not fit all in evaluating model selection scores for image classification. Scientific
Reports, 14(1), p.30239.
Alam, T.S., Jowthi, C.B. and Pathak, A., 2024. Comparing pre-trained models for efficient leaf disease detection: a study on custom
CNN. Journal of Electrical Systems and Information Technology, 11(1), p.12.
Elhanashi, A., Saponara, S., Zheng, Q., Almutairi, N., Singh, Y., Kuanar, S., Ali, F., Unal, O. and Faghani, S., 2025. AI-Powered Object
Detection in Radiology: Current Models, Challenges, and Future Direction. Journal of Imaging, 11(5), p.141.
Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A. and Brendel, W., 2018, November. ImageNet-trained CNNs are
biased towards texture; increasing shape bias improves accuracy and robustness. In International conference on learning
representations.
Krizhevsky, A., Nair, V. and Hinton, G., 2009. CIFAR-10 (Canadian Institute for Advanced Research) dataset. Toronto: University of
Toronto. Available at: https://www.cs.toronto.edu/~kriz/cifar.html (Accessed: 9 October 2025).
Wang, D., Wang, J.G. and Xu, K., 2021. Deep learning for object detection, classification and tracking in industry applications. Sensors
(Basel, Switzerland), 21(21), p.7349.
Xu, Y. and Goodacre, R., 2018. On splitting training and validation set: a comparative study of cross-validation, bootstrap and systematic
sampling for estimating the generalization performance of supervised learning. Journal of Analysis and Testing, 2(3), pp.249-262.